1. FIELD OF INVENTION
The present invention relates to novel DNA
cassettes and a method of mapping genetic material,
including incorporation of
sequences into genomic DNA, generation of rare restriction
enzyme cutting sites, and size resolution of DNA
up to and greater than the million base pair size.
10 These technologies can be used to create Y! ! * 

~. T . gmmio MP „ ^ 7:: r wic 

compared to eubs.qu.ntly generated «p. of e . Uu ?„ 

or alteration, in the prljy l^eT 

16 " Ueh "* Pt *"* method l^thu. 

of detecting ,. M tie disorders, di...../ 
J ' 1 - ro "' D ■»nwwn Wm , m .... „.,.

Ov.r the pmet several y.arn, the ability Z 

r-olv. o*. of large , iIa ... Md . a p^^tider 

orttianiir » I** MqU * nCtn3 0t «- t— T 

The *evston. to this adv™. 

and Mucins, use) . These techniques u. tt poseible to 

greater in size than was possible through historical
electrophoretic techniques. The independent development of
techniques to generate extremely rare restriction enzyme
cleavage specificities in vitro allows for the generation of
large restriction fragments (McClelland et al., Proc. Natl.
Acad. Sci. USA 81:983-87, 1984; New England Biolabs Catalog
1985-1986, p. 29ff). The mechanism described therein relies
on the use of a methylase which methylates DNA at specific
sites recognizable subsequently by a restriction enzyme
which cleaves DNA only when it is methylated at the
restriction enzyme cleavage sequence. Appropriate choice of
the correct methylation system therefore allows generation
of very large restriction fragments.



2.2. SIZE OF GENOMES AS A TECHNICAL ANALYSIS PROBLEM
The problems encountered in analyzing large
genomes cannot be overstated. For example, the human genome
is approximately 3 x 10^9 base pairs in length covering an
estimated 3300 centimorgans (White et al., Nature 313:101-105,
1985). Estimates of the number of marker loci needed
to span this genetic length range upwards from 100 (Lange,
K. and Boehnke, M., Am. J. Hum. Genet. 34:842-45, 1985;
Botstein et al., Am. J. Hum. Genet. 32:314-331, 1980). In a
paper on how to generate human linkage maps, it has been stated,
referring to the effort needed, "This will be a large scale
endeavour and high efficiency of data collection will be
important" (White et al., supra at 101). Current
estimates of how long it will take to construct a linkage
map of human DNA range from 2 to 5 years assuming a great
deal of effort from many researchers is combined (Lewin, R.,
Science 233:157-58, 1986).

An effective map suitable for general diagnostic
and prognostic testing would require far more information
than the 100 markers cited above. Ideally the map would
have markers spaced every 50 kilobases of DNA or less, or
would consist of upwards of 10^4-10^5 markers. Generation of
this many ordered markers is not feasible using current
techniques.
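
The marker count quoted above follows from simple division of genome length by marker spacing. The short Python sketch below illustrates that arithmetic only; the function name `markers_needed` is hypothetical, and the genome size and spacing are the approximate figures given in the text.

```python
def markers_needed(genome_bp: int, spacing_bp: int) -> int:
    """Number of evenly spaced markers needed to cover a genome.

    Uses ceiling division so that a partial final interval still
    receives a marker.
    """
    if genome_bp <= 0 or spacing_bp <= 0:
        raise ValueError("genome size and spacing must be positive")
    return -(-genome_bp // spacing_bp)


# Approximate figures from the text: 3 x 10^9 bp genome, 50 kb spacing.
HUMAN_GENOME_BP = 3_000_000_000
print(markers_needed(HUMAN_GENOME_BP, 50_000))  # 60000
```

At 50 kilobase spacing the result is 6 x 10^4 markers, which lies within the 10^4-10^5 range stated above; closer spacing pushes the count toward the upper end of that range.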



Hh " a KSnmMe *"•"> »een mad. i„ eonatruetina
'"" ia i-Pl.~nt.tion of various „=l„cul.r 

ZT*T:T S : " pros,nt — *• — » ~ a 

9«.a. spannin, only a mi portlon of ^ 

Sfi. IMS) . Th^a clone<1 ,„„ tave bw , - 
M-tlflr raatriction tn^t langth polyBotphl J ^ 

—Tl? * h * V ° PrW ' n *» -l.gno.ln, aoa. 

!""!" C d i a0rd * ra - DM » it « utility, th. inf OCT ation 

10 T~ . * ttXm9b **• Uaa o£ —* i. United 

S a=l« T,^ SPar - " » * «" »<*» ~ 

? or °T.tt ?- U U necessary to UoUt. . din.™ probe, a 
for each disease. A clearly beneficial diagnostic tool
would therefore be the ability to rapidly screen entire
genomes without the necessity of isolating probes.

It is an object of the instant
invention to be able to screen an entire genome of a cell or
organism rapidly.

It is another object of this invention to be able
to generate a map or maps of a
cell's or organism's genomic DNA.

It is another object of this invention to be able
to compare genomic maps
generated by this method.

It is a further object of this invention to use
the comparative information thus generated to locate,
identify, or define genetic lesions, mutations, insertions,
defects and polymorphisms in genomic
DNA.

It is still a further object of this invention to
use the instant invention as a diagnostic or prognostic tool
in order to detect genetic defects, such as, but not limited
to, prenatal diagnosis of genetic aberration, inherited
genetic disease, and induced or acquired genetic disease.

Still other objects and advantages will become
readily apparent from the following description and claims.

3. SUMMARY OF THE INVENTION
The present invention involves a method and
biological tools for mapping genomic DNA. This mapping
technique can be used as a diagnostic test for detecting
genetic disease and polymorphic loci and as a prognostic
test.

The present mapping method comprises integrating
DNA of the organism whose genome is to be mapped with a
cassette of DNA containing a rare restriction sequence or
site, the rare sequence or site being flanked on one or both
sides by a unique DNA sequence. By "rare restriction
sequence or site" it is meant one which does not occur, or
occurs at very low frequency, in the genome to be mapped, or
can be made to be cleaved preferentially over genomic sites
by any means, so long as its frequency allows partial
mapping away from the rare restriction sequence into the
genome. The "unique DNA sequences",
hereinafter referred to as DNA A, and optionally, DNA B,
need not be restriction sequences, but rather are simply
sequences capable of being identified uniquely in a
background of host cell genome, which preferably do not
occur in the host cell genome, and which differ from each
other as well as from the rare sequence. In one embodiment,
the cassette is inserted into the host cells by way of a
vector, preferably a vector which will accomplish gene
transfer through a single random integration of the
cassette into the host cell. Independently derived,
monotonically integrated clonal isolates are then examined
and analyzed. The unique DNA A and/or DNA B flanking the rare
restriction sequence serves as a marker within the genome.
The distances from the uniquely identifiable DNA sequence to
the restriction sites on each are measured, and then, by
subtraction, the distances between restriction sites are
calculated. From this information, a regional restriction map
is constructed. Repetition of this procedure allows
construction of a total genomic map, by recognition of
secondary restriction pattern overlap.
co«nri fil 1316 lnVention also novel DNA cassettes 

comprising a restriction sequence rare 

test-on «t „..,„_, * M re for the genome to be 

jo fstad, nantad on at ieast on. .id. ^ , nueleotl(le 

o!H«» : Pr ° Vld * 4 *" n °™ 1 =ntai„I„ g tl 

= ln lnt * 9r " in « «" «»«tta int. th. host 

celi genome. Th. nature of tho casaatt. vector will 

* ZZ ^" <toS *" din9 -«*—-« thetanJ 1 . ha 
dcTlot naT! '« »y -.home which 

"Ration, an* .» app„pri.te genomic 

^"al^lT « c-atf co.pri.in, . 

" orl^ ^L! * * ntrmiM "*»■>«• Exa^ia, of 
organism, which would fall i*> _ 
ara not . Uch * eat ^<>ry include bat 

«. not linltad to — !. (humane,, bird., „„ DrosophU., 
this casaatt. can ba transmitted to th. ho,t «U by way or 
non-essential region of the defective retrovirus,

oUgonueLotid. restriction ait. pi. , 1M9rt «» 

sector co.^^; 

=opi« of th. HotI reception IZlZ "* 
'° ,„ , Pi?U ™ ' ° Ch,,Mt;i " Uv llK-tratM » procadur. 

s * wwnt onc « • ~P i« contracted. Dascription or T 
— b- 1. found in Saetlon 5.,, infra °' 

mm,t,„- . Pl9,Ira * iUuatrit « s U>« Pwaonc. and ta , ltu 

; 1 " lMtol " 811,4 111 ««*«« l»« * - 

^.V-"„ H .' D r ' ^ 5 - ■*»"» «■»- «ntrol, 
6 *** N «° ninus control. The lower 3 3 m 

St LTL^' PBQ " aaCti °" ° f «■—««- DMA. „ot. 

that both BaaHl^ x diction and MCla I: op nI , SatI 
ai 9 ~tion pwduc. th. .... ^ 

y-rt chro^ca on th. outsid. 1». Jf" 
««*««o.. tv.lv. distinct band. >r. r.solv«d in the 

30 D * naa "present doublets). 
5. DETAILED DESCRIPTION OF THE INVENTION

5.1. CASSETTE CONSTRUCTION
In accordance with the present invention, genomic
DNA can be mapped by inserting into said genomic DNA a DNA
sequence which is a rare cleavage site in the context of the
host DNA with which it is integrated. This rare restriction
sequence is flanked at one end by a uniquely identifiable
DNA sequence, termed unique DNA A, and optionally at the
other end by a second uniquely identifiable DNA sequence,
termed unique DNA B. These unique DNA sequences can be
natural or synthetic. This combination of sequences, i.e.,
the rare restriction sequence, unique DNA A and optional
unique DNA B, is termed a cassette. Optionally the cassette
may contain a sequence or sequences which facilitate
subsequent isolation of the genomic DNA flanking the
cassette. For example, the cassette may also contain a high
affinity protein binding site. In one embodiment, the λ
repressor binding sequence can be used in conjunction with a
DNA affinity column composed of covalently bound repressor
monomers. In this way, DNA fragments containing the unique
DNA A or unique DNA B sequence can be readily isolated for
subsequent manipulation. In another embodiment the cassette
can contain genetic functionalities that allow it to be
maintained as a plasmid in E. coli or other appropriate
host, thus facilitating ready isolation of the unique DNA A
or unique DNA B and flanking genomic DNA for subsequent
manipulation.

The actual sequences of the rare restriction
sequence, unique DNA A, and unique DNA B are not critical.
These sequences, however, should be different from one
another, and, in a preferred embodiment, the sequences
should occur infrequently, if at all (i.e., they are
underrepresented) in the host genomic DNA. It is possible
that a similar sequence or sequences exists in the host
organism. All that is required in this procedure is to
differentiate between the inserted DNA and the endogenous
DNA.

5.1.1. RARE RESTRICTION SITE

The identity of the rare restriction site will
differ depending upon the host organism whose genome is to
be mapped, since a particular sequence may be rare in one
organism, but not another. The term "rare" in the
present context must be defined operationally. An initial
consideration in choosing an appropriate sequence is what
will be the preferred fragment size resulting from
cleavage. The preferred fragment size is not dictated by any
particular requirement, but rather is a matter of
convenience: the larger the fragment size produced,
the easier subsequent mapping will be.
Large is, of course, determined relative to the total size
of the genome to be mapped. Smaller fragments are just as
acceptable functionally, but require many more repetitions
of the procedure in order to map the genome. For
this reason, large fragments are preferred.

Once a general determination is made as to a
fragment size which would be acceptable for the purposes of
the genome under consideration, the choice of a site can be
made. One approach is to
treat the DNA with an appropriate enzyme (appropriate to be
defined below), and observe the size of the fragments
produced. If these are in accordance with the desired size,
an appropriate site has
been chosen, and may be used in the present procedure.
In other cases, a predictive approach to

the selection of a sequence may be desired. In such a case,
an appropriate sequence may be predicted empirically by
reference to the overall composition of the
genome to be mapped. For example, a general knowledge of
the approximate GC content of the genome provides a
convenient means by which the expected average fragment
size, in base pairs, generated by cleavage at any given
restriction site can be predicted. Information relating to
GC content of various organisms is readily available in the
literature (e.g., Hill, J. Gen. Microbiol. 44:419-437,
1966), or is readily determinable by known techniques (Owen,
R.J. and Pitcher, D. (1985) In "Chemical Methods in
Bacterial Systematics", pp. 1-15, edited by M. Goodfellow and
D.E. Minnikin, Academic Press, London). Given the fraction
of total DNA which is GC, AT content can also be determined
(fraction GC + fraction AT = 1). Assuming random order of
dinucleotides/trinucleotides, the average fragment size
(AFS) generable by cleavage of a given recognition sequence
can be calculated by the following formula:



$$\mathrm{AFS} = \left(\frac{2}{r_1}\right)^{a} \left(\frac{2}{1 - r_1}\right)^{b}$$

where $r_1$ = fractional GC content
$1 - r_1$ = fractional AT content
$a$ = # G + C in recognition sequence
$b$ = # A + T in recognition sequence.


For example, assume .40 G+C content and .60 A+T
content. The sequence of choice is ATCGAT. In this case:
$r_1 = .4$, $a = 2$
$1 - r_1 = .6$, $b = 4$
Inserting these values in the formula,

$$\left(\frac{2}{.4}\right)^{2} \left(\frac{2}{.6}\right)^{4} = (25)(123.46) \approx 3086 \text{ base pairs}$$

Thus, the average fragment size produced by
cleavage of the restriction sequence ATCGAT in this genomic
DNA is approximately 3086 base pairs. From the
initial estimation of the desired fragment size for the
genome of choice, it is readily apparent whether or not the
chosen site is acceptable for the purpose.
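
The average fragment size calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the formula takes the form AFS = (2/r1)^a (2/(1 - r1))^b, which is the form consistent with the worked ATCGAT example above; the function name `average_fragment_size` is hypothetical.

```python
def average_fragment_size(site: str, gc_fraction: float) -> float:
    """Expected average fragment size (bp) for a recognition sequence.

    Assumes random base order, so each G or C occurs with probability
    gc_fraction / 2 and each A or T with probability (1 - gc_fraction) / 2.
    """
    site = site.upper()
    a = sum(base in "GC" for base in site)  # number of G + C in the site
    b = sum(base in "AT" for base in site)  # number of A + T in the site
    if a + b != len(site):
        raise ValueError("site must contain only A, C, G and T")
    if not 0.0 < gc_fraction < 1.0:
        raise ValueError("gc_fraction must lie strictly between 0 and 1")
    return (2.0 / gc_fraction) ** a * (2.0 / (1.0 - gc_fraction)) ** b


# Worked example from the text: ATCGAT in a genome with 40% G+C.
print(round(average_fragment_size("ATCGAT", 0.40)))  # 3086
```

As a sanity check, a genome with equal base composition (50% G+C) gives 4^6 = 4096 base pairs for any six-base site, the familiar expectation for a six-base cutter.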

The above schemes are not the only methods by
which an appropriate restriction sequence can be chosen, but
others will be readily apparent to those
skilled in the art. Similar equations have been previously
described, and
summaries of rare v. common sequences are available in the
literature (McClelland et al., in Gene Amplification and
Analysis, Elsevier Science Publishing Co.). The
skilled artisan can routinely make an appropriate selection
for the genome of interest.

In one embodiment, the selection of a rare site
can be taken an additional step, by modification of the
chosen site so that it is even less likely to be
cut than it would be in its unmodified state. Selective
methylation of a particular sequence, for example, may,
depending on the organism, result in the production of a
highly specific cleavage site which is only rarely cut in
the genome of choice (McClelland et al., PNAS USA 81:983-987,
1984).

Alternately, the chosen cleavage site can be
arranged in tandem arrays. This will normally result in a
preferential cleavage of the chosen site within the
cassette, over cleavage of the same sequence in a genomic
site, in turn producing greater efficiency in large fragment
production. Similarly, incorporation of a DNA molecule
modified to cleave, such as a D-loop, or a triple helix
(Strobel et al., J. Am. Chem. Soc. 110:7927-7929, 1988) will
achieve substantially the same effect. These are but a few
examples, however, and any modification of the chosen
sequence which ultimately aids in the generation of
appropriate fragment sizes by selectively concentrating
cleavage at the site of cassette insertion is contemplated.

In the case of modification of the sequence
selected, the initial sequence need not even be particularly
uncommon in the host genome, but merely needs to be
modifiable in such a way as to render it "rare" in the
present context.

In one embodiment, when a human genome is being
mapped, a preferred sequence can be the overlapping
ClaI/ClaI sequence

ATCGATCGAT
TAGCTAGCTA.

This site is of particular utility since it is subject to
selective methylation by the enzyme M.ClaI (McClelland et
al., PNAS USA 81:983-987, 1984). This methylation renders a
rare 10-base sequence cleavable, since mammalian DNA is not
routinely methylated at ClaI. The methylated 10-base ClaI
sequence is subject to selective cleavage by the restriction
endonuclease DpnI (or CfuI), which cuts only DNA which is
methylated at adenine in both strands of the recognition
site. An additional benefit can be obtained by constructing
the site, within the cassette, in tandem repeats.
Surprisingly, the efficiency of cleavage within the cassette
appears more than additively increased when compared with
cassettes containing a single copy of the sequence.

The selected oligonucleotide restriction sites
can be readily prepared synthetically.
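
The overlap structure of the 10-base sequence can be checked with a simple string search. The sketch below is illustrative only: the helper `find_sites` is a hypothetical name, the ClaI recognition sequence ATCGAT is taken from the text, and the DpnI recognition sequence GATC is supplied here as an assumption, since the text does not spell it out.

```python
def find_sites(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of motif in sequence."""
    sequence, motif = sequence.upper(), motif.upper()
    width = len(motif)
    return [
        start
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == motif
    ]


OVERLAPPING_CLAI = "ATCGATCGAT"  # top strand of the sequence given above

# Two ClaI sites (ATCGAT) overlap by two bases...
print(find_sites(OVERLAPPING_CLAI, "ATCGAT"))  # [0, 4]
# ...and a single GATC (assumed DpnI target) spans their junction.
print(find_sites(OVERLAPPING_CLAI, "GATC"))    # [3]
```

The single GATC lies across the junction of the two ClaI sites, which is why methylation of both overlapping ClaI sites is needed to place a methylated adenine on each strand of that tetranucleotide.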

5.1.2. UNIQUE DNA AND VECTOR SELECTION

The purpose of the uniquely identifiable DNA
flanking the restriction site is to provide a basis for
detecting the cassette amidst the genomic DNA. In order to
fulfill this purpose, unique DNA A and unique DNA B need
only be distinguishable, by some detectable means, from the
host DNA. To this end, one may synthetically generate
sequences which are, based on knowledge of the overall
composition of the host genome, expected to occur only
rarely, if at all, in the genomic DNA. Alternately, the
sequences can be generated by fragmentation and isolation of
genomic DNA derived from an organism genetically distant
from the host organism. For example, for mapping eukaryotic
genomes, unique DNA can be derived from procaryotic, i.e.,
bacterial or viral, genomic DNA. The unique DNA is then
detectable by virtue of, for example, hybridization with a
labelled complementary DNA probe, or the presence of a
selectable marker.

In a preferred embodiment, the unique DNA
sequences are chosen in association with a vector used to
transform the host cells. In other words, the vector chosen
is preferably one sufficiently distinct genetically from the
host cell to permit detection of the vector DNA after its
integration into the host cell genome. For example, in one
embodiment a replication-defective amphotropic virus
vector, into which the rare restriction sequence has been
inserted, such as that described by Sorge et al. (Mol. Cell.
Biol. 4:1730-1737, 1984), is used to infect a mammalian cell
line. These viruses are capable of infecting cells, but
once genomically integrated, are incapable of post-insertion
replication, preventing reinsertion by the virus into other
segments of the genome.
5.1.3. CASSETTE INTEGRATION
The cassette constructed, as outlined above, must
be integrated with the genomic DNA to be mapped. This
integration can be achieved by any method useful in
attaining DNA transfer; this includes, but is not limited
to, the use of electroporation, microinjection, infection or
ligation into a cloning vector. The term "integration" in
the present context means the association of the cassette
with the genomic DNA in a continuous piece of DNA.
In a preferred embodiment, however, the cassette
is integrated into the genomic DNA to be analyzed by use of
a vector which inserts the cassette into a host cell. For
the present purposes, the vector is preferably a
transposon-like element, i.e., one capable of being
integrated into the genome essentially "at will". The use
of a vector provides a convenient source of uniquely
identifiable DNA: insertion of the rare restriction
sequence into a properly chosen vector automatically flanks
the restriction sequence with a distinct sequence which,
upon integration into the host cell genome, will be readily
detectable, provided the vector sequence is sufficiently
distinguished from the host cell.

Appropriate vectors for a variety of different
cell types are readily available. For example, for DNA
insertion into prokaryotic cells, the utility of
transposable elements has long been recognized (Kleckner,
Cell 11:11-23, 1977; Calos et al., Cell 20:579-595, 1980),
and these provide a means for sequence neutral integration of
the type necessary to attain insertion of the restriction site.
Ty elements of yeast are also similar in structure, and to
some extent, function, to the prokaryote transposons (Boeke
et al., Cell 40:491-500, 1985). In Drosophila, P elements
have been routinely used to introduce cloned sequences into
the organism (Steller et al., EMBO J. 4:167-171, 1985). In
higher plants, the use of Ti plasmids to introduce exogenous
DNA is now routine technology (Chilton et al.,
Cell 11:263-271, 1977).

. ». -.IT ^ r lnt.,r.t io „ ofX'^ VltU " V * et ° r> "» 
choice for v.ctoro will be wlti Victor »- Alt.n»t« 

Mtendod to convoy that, by soma nun. - 
,0 =»«tt. by dotation of ml £Zl ' T""V' *" 

vector which is ger.jticallv diet.** * f a 

The cassette can be constructed so as to
include a particular selectable marker which allows the
identification of the presence of cassette DNA.

20 5 - 2 ' PROPAGATE TRAHSm^ 

In the case in which the cassette DMA i« 

AOflS * this can be accomDlisheH w v 
2S "l"i=n or cu .ortin,. Z £ 

what W nece88ary 8o m to ^ ^ aaae * 

inteorat* r^ 16 ' *"* a vlwl to 

30 integrate cassette DNA into BaanaUan ceii« 

35 -.9., »tnb.. s . and T9nin , M-f hoi^^ .jf^: 
2249, 1983; Mann et al., Cell 33:153-159, 1984; Sorge et
al., Mol. Cell. Biol. 4:1730-1737, 1984). This can be
achieved by insertion, with the vector, of a selectable
marker, such as antibiotic resistance. Similar screening
procedures can be achieved with whole organisms, e.g., whole
plants. Here, vectors can also be constructed to carry a
selectable marker, and plant cell cultures transformed
thereby. Plants regenerated from culture can be screened on
a selective medium at an early stage of development, and the
surviving plants represent those which have integrated
cassette DNA.

The above technique will result in clonal
populations which have an extremely high probability of
containing a single inserted cassette. A single insertion
is preferable, but not critical, to the present method.
Insertion of additional cassettes, of different structure
from the first, is also contemplated. Optionally, one can,
at this point, verify that each clonal population does
contain a single insert of a given construction via a
variety of conventional techniques, such as assaying
relative cassette copy number per cell by comparison with a
known cassette DNA concentration via hybridization. It is
understood, however, that any technique which will identify
the clonal lines which contain a single insertion of the
cassette is acceptable for the instant invention. It is
also understood that insertion of the cassette into the
genomic DNA, causing integration of the cassette and genome,
is only one means of attaining integration.

5.3. DNA ISOLATION

The clonal populations which contain the genomic
DNA to be mapped are then lysed in any manner which is
suitable for the DNA separation method selected. These
techniques can include, but are not limited to, prior
suspension of cells in agarose, e.g., the agarose microbead
technique (Cook, P., EMBO J. 3:1837, 1984) and the agarose
block technique (Schwartz, D. and Cantor, C., Cell 37:67,
1984 and U.S. Patent No. 4,473,452). These techniques allow
in situ cell lysis by enzymes, detergents and proteases
diffused into the agarose while maintaining DNA integrity.
Any method of DNA isolation which leaves the DNA in a state
available for subsequent treatment is acceptable (see, e.g.,
Maniatis et al., supra).



5.3.1. SECONDARY DNA TREATMENT

Following isolation, the genomic DNA is then
treated so as to produce fragments suitable for mapping. In
general, the DNA will be cleaved with a restriction enzyme
having specificity for the rare restriction site, and at
least one secondary restriction enzyme. In one embodiment,
as already noted above, the rare restriction site is first
treated with a site-specific methylating enzyme, in order to
render the restriction site more rare, and then followed by
cutting with a methylation dependent restriction enzyme. In
accordance with the overall strategy of the present method, in
a preferred embodiment this initial enzyme treatment will
preferentially cut within the cassette, and will produce
little or no cleavage within the genomic DNA.

The DNA is also partially digested with one, or
independently, a series or mixture of secondary restriction
enzymes, so that each DNA sample will be digested with the
restriction enzyme specific for the rare site, and partially
digested with the secondary restriction enzymes. These
secondary enzymes will be specific for various sequences
within the genomic DNA. The identity of the enzyme used is
not critical, and can be any restriction enzyme which cuts
within the genome to be mapped. However, if it is desired
to produce larger fragments for mapping, the chosen enzyme
will preferably be one which cuts a relatively uncommon
homologous to one of the unique flanking DNA sites.
Preferably the complementary sequence will be labelled with
a radioactive, fluorescent or color indicator. Most
frequently, the separated DNA fragments will be blotted onto
a support membrane, such as, but not limited to, nylon or
nitrocellulose, prior to hybridization, in accordance with
the method of Southern. This procedure results in an end
label of only those fragments containing the introduced
restriction site, producing a ladder of fragments giving the
genomic restriction site pattern away from the integration
site in one direction. The same blot can be probed with a
sequence homologous to the other side of the restriction
site (i.e., DNA B, if present), producing a fragment
pattern representing the genomic restriction sites on the
other side of the integrated cassette.

Once the restriction fragments have been
identified and physically ordered, mapping can be
initiated. The general strategy employed for physical
mapping is a contig strategy which has been described
previously for mapping of the yeast and nematode genomes
(Olson et al., PNAS USA 83:7826, 1986; Carlson et al., PNAS
USA 83:7821, 1986; Carlson et al., Nature 335).
In very general terms, the mapping procedure is as follows:
One fragment will represent that portion of the
genomic DNA from unique DNA A to a first secondary
restriction enzyme cut. A second fragment will represent
the distance from unique DNA A to a second secondary
restriction enzyme cut. Deducting the first distance from
the second distance generates the distance from the first to
the second secondary cuts. For large genomes, the above
procedure will have to be repeated a number of times, the
number of times being dependent on the length of the genome
to be mapped. The clonal maps are then compared in order to






WO 90/12891 -22- PCT/US89/01983

ascertain overlapping portions. By aligning these
overlapping portions, a complete map of the genomic DNA can
be generated.
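
The subtraction step of the mapping procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `secondary_site_map` and the example fragment sizes are hypothetical, and each ladder band is assumed to run from unique DNA A to one secondary restriction cut, as described above.

```python
def secondary_site_map(ladder_kb: list[float]) -> tuple[list[float], list[float]]:
    """Convert an end-labelled partial-digest ladder into a site map.

    ladder_kb holds the sizes of fragments detected with a probe for
    unique DNA A. Each size is the distance from unique DNA A to one
    secondary cut, so the sorted sizes are the cut positions and the
    differences between neighbours are the site-to-site distances.
    """
    positions = sorted(set(ladder_kb))
    intervals = [far - near for near, far in zip(positions, positions[1:])]
    return positions, intervals


# Hypothetical ladder (kilobases) read from one probed blot.
positions, intervals = secondary_site_map([310, 45, 120])
print(positions)  # [45, 120, 310]
print(intervals)  # [75, 190]
```

Running the same function on the ladder detected with a probe for unique DNA B would give the cut positions on the other side of the cassette; regional maps from different clones can then be compared for shared interval patterns to find overlaps.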

5.5. CONSTRUCTING A MAMMALIAN (HUMAN) MAP

A brief summary of the procedures followed in
this method is found in Figure 1. Briefly, the ClaI/ClaI
overlapping sequence ATCGATCGAT has an estimated frequency
of occurrence in the human genome of once every 2 x 10^8
base-pairs. This sequence is inserted into the DNA of a
defective amphotropic retrovirus. The DNA recombinant is
then transfected into a cell line harboring a trans acting,
replication defective copy of the retrovirus (see, e.g.,
Sorge et al., Mol. Cell. Biol. 4:1730, 1984; Cohn et al.,
M«S «:49, 1981). This allows assembly of RNA containing
viral particles, which are then exported from the cell.
This type of construction has the demonstrated capability to
infect cell lines, but is incapable of post-insertional
replication. These particles are used to infect,
monotonically, untransformed cells which contain the genome
of interest. Infected cells, containing a copy of the
virus, are clonally propagated. As proviral insertion is an
essentially random process, each of the individual clones
will have its genome uniquely marked by a defective virus
containing the ClaI/ClaI sequence. To construct a physical
map of the DNA surrounding the retroviral integration site
in a given clone, DNA is isolated and then methylated with
M.ClaI (McClelland, Nucl. Acids Res. 9:6795-6804, 1981),
creating a DpnI site at the overlapping ClaI/ClaI site.
Following cleavage with DpnI, the DNA is partially digested
with a second restriction enzyme and then electrophoresed
next to appropriate DNA size markers, for example, pulse
field electrophoresis with a partially annealed λ phage
ladder as size standard. In this approach, electrophoresis
is carried out under conditions which allow resolution of
large DNA molecules in order to obtain as much mapping
information as possible. The gel is then
blotted and probed with a sequence
from one side of the ClaI/ClaI site. This will
result in an end label of only those fragments containing
the introduced ClaI/ClaI site, producing a ladder of
fragments giving the restriction pattern away from the
integration site in one direction. Probing with a sequence
from the other side of the ClaI/ClaI site produces
a ladder of fragments representing the restriction sites
of the genomic DNA on the other side of the integrated
provirus.

F irus * In tnia fashion, at least 3.1 .on 

»ay be restriction mapped tor eacT^i . " basa - pal « 

Assuming the human «„! f ^ Cel1 is <*ate. 

9 " ia numan genome is 3 billion k«— , 

20 one. th. «J »pT£l£,™ ,1,, ? " PUl " d 
A BamHI sticky ended ClaI/ClaI oligonucleotide,

5'-GATCCATCGATCGATG-3'
    3'-GTAGCTAGCTACCTAG-5',

™ ^!!!^L int0 °" ' vector. p Daum .o (Staler 

^f"!"^ • »• P«WinKer cloning 

•it. of puch.n«, 1. opened with ^ „,„ pno.phoryl.tH 

M «. ««a to tmnrtor. th. t eoU .train ,„» . 
Selection for insert positive transformants will follow the
strategy of Vieira and Messing (Gene 19:259, 1982). Positive
clones are grown up and insert size (e.g., number of
ClaI/ClaI sites inserted) determined by double digestion of
each positive plasmid with SmaI and SalI. Each has a unique
site flanking the insert (Steller, H. and Pirrotta, V.,
supra). The insert sizes are then resolved on a native
polyacrylamide gel against size standards. Plasmids
containing 6, 8, and 10 ClaI/ClaI sites, respectively, are
grown up preparatively by standard techniques (Maniatis,
Molecular Cloning, Cold Spring Harbor Laboratory, 1982).
These plasmids are used to microinject Canton S embryos.

Embryos obtained from a Canton S (Brown
University) strain are injected (Zalokar, Microscopica Acta
84:231, 1981; Zalokar, Experientia 37:1354, 1981;
Santamaria, Dev. Bio. 96:285, 1983) with the P element
construct generated as described above along with the
transposase positive pπ25.7wc (Karess and Rubin, Cell
38:135, 1984). Positive P element containing flies are
selected by addition of G418 to the growth media as
described (Steller and Pirrotta, supra). Positive fly stocks
are maintained under G418 selection for several generations
in order to ensure stable P element insertion into the germ
line. P element insertion can be verified by isolation of
DNA from each positive line and spot blotting (Kafatos et
al., Nucl. Acids Res. 7:1541, 1982) followed by hybridization
with [32P]-labeled P element plasmid.

Each isolated transformed line is maintained for
production of DNA for restriction mapping from the inserted
P element. DNA can be prepared from overnight collections
of embryos which have been dechorionated (Santamaria, in
Drosophila: A Practical Approach, D.B. Roberts, ed., IRL
Press, 1986) prior to embedding in agarose plugs (Schwartz
and Cantor, Cell 37:67, 1984) for lysis. Lysis is performed
in a Tris buffer containing 1 mg/ml proteinase K and 1%



WO 90/12891    PCT/US89/01983



sarcosine. A slice (3-10 µl) of each agarose plug containing the
DNA is washed in 10 mM Tris/1 mM EDTA for one hour, then the
agarose slice is incubated in 10 ml of the buffer containing
200 µl of 100 mM PMSF (phenylmethylsulfonyl fluoride). This is
repeated once again, then the PMSF is washed out of the
slice by a one-hour incubation in the 10 mM Tris/1 mM EDTA
solution. The DNA is then clean enough for further
manipulation.
5.f. GENE LOCALIZATION
The method also has utility in identifying the
position of any genomic DNA fragment, for example, to locate
the map position of a cloned DNA segment. From rudimentary
maps of contigs determined by in situ hybridization,
it is possible to determine specific localization of a gene
or genes known to fall in given chromosomal regions by
cleaving DNA from the cell lines representing the appropriate
contig to completion with both MClaI/DpnI and NotI. The
procedure is outlined in Figure 3. DNA so cleaved can be
pulse-electrophoresed, blotted and probed sequentially with
probes hybridizable to the unique DNA to either side of the
MClaI/DpnI site. This identifies MClaI/DpnI-NotI fragment
sizes. The same blot can then be probed with the gene of
interest. This procedure will identify the cell line in
which the genomic NotI-NotI fragment containing the gene has
been interrupted by retroviral insertion, and to which side
of the MClaI/DpnI site the gene falls. This localizes the
gene to that region of the contig. Failure to identify such
a sequence would suggest repeating the procedure,
substituting a lower frequency cutting system for NotI
(e.g., MTaq/DpnI; McClelland and Nelson, supra).
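
The band-matching logic of this localization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `locate_gene`, the 10 kb co-migration tolerance, and all band sizes are hypothetical and do not come from the specification. The idea is that, in the cell line whose NotI-NotI fragment has been interrupted by the cassette, the gene probe should light up the same band as one of the two flanking probes.

```python
def locate_gene(blots, tolerance_kb=10.0):
    """Return {cell_line: side} for lines where the gene band co-migrates
    with the band seen by the left- or right-flank probe.

    blots maps a cell-line name to band sizes in kb for three probings of
    the same blot: 'left' and 'right' (the unique DNA on either side of
    the MClaI/DpnI site) and 'gene' (the gene of interest).
    """
    hits = {}
    for line, bands in blots.items():
        for side in ("left", "right"):
            if abs(bands[side] - bands["gene"]) <= tolerance_kb:
                hits[line] = side
                break
    return hits


# Hypothetical band sizes (kb); not taken from the source.
blots = {
    "line_A": {"left": 400.0, "right": 650.0, "gene": 1050.0},
    "line_B": {"left": 300.0, "right": 520.0, "gene": 522.0},
}
result = locate_gene(blots)
print(result)
```

A cell line that returns no match (line_A here) would correspond to a gene lying on a NotI-NotI fragment that the cassette did not interrupt.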

The following example illustrates one method for
generating maps of human DNA. It is understood, however,
that the instant invention is not limited to human cells
only. Rather, genomes from any prokaryotic or eukaryotic
cell or organism can be mapped using the instant invention,
and the adaptations required will be readily recognized by
those skilled in the art, in light of the specification and
examples.

CONSTRUCTING A HUMAN MAP 

6.1. VECTOR CONSTRUCTION 
For mapping a human genome, the strategy
is to utilize a defective retroviral vector to insert a rare
restriction site into random positions in the human host
cells. The rare restriction site selected is the ClaI/ClaI
overlapping sequence, which has been estimated to occur at a
frequency in the human genome of about once every
200,000,000 base pairs (McClelland and Nelson, supra). The
following overlapping oligonucleotide is synthesized,
flanked by BamHI cohesive ends:

5'-GATCCATCGATCGATG-3'
    3'-GTAGCTAGCTACCTAG-5'
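
As a quick check of the oligonucleotide design, the following Python sketch locates the ClaI sites and the GATC (DpnI) sequences in the top strand and confirms that the two strands pair over the duplex region. It assumes the top strand reads 5'-GATCCATCGATCGATG-3', which is the reading consistent with the BamHI (GATC) cohesive ends and with the printed bottom strand; the helper names are illustrative only.

```python
CLAI = "ATCGAT"  # ClaI recognition sequence
DPNI = "GATC"    # DpnI recognition sequence (cleaved only when methylated)


def find_all(seq, motif):
    """Return every 0-based start index of motif in seq, overlaps included."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


top = "GATCCATCGATCGATG"             # 5'->3', assumed reading (see lead-in)
bottom = "GTAGCTAGCTACCTAG"[::-1]    # printed 3'->5', reversed to 5'->3'

clai_sites = find_all(top, CLAI)     # two sites sharing two bases
dpni_sites = find_all(top, DPNI)     # overhang GATC plus the internal GATC
duplex_ok = revcomp(bottom[4:]) == top[4:]  # 12 bp paired region

print(clai_sites, dpni_sites, duplex_ok)
```

The internal GATC starts at index 8, inside the region shared by the two ClaI sites (indices 5-10 and 9-14), which is what makes the overlapping ClaI/ClaI sequence a candidate DpnI target once it has been methylated.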

The oligonucleotide is inserted, both singly and
in tandem arrays, at the unique BamHI site into the murine (MuLV)
retroviral shuttle vector pZipNeo originally described by
Cepko (Cell 37:1053, 1984). This vector contains a pBR322
origin of replication, an SV40 origin of replication, and a
selectable marker, the resistance gene for G418 (neomycin).
Figure 2 shows the map of the vector pZipNeo. A number of
constructs can be made: a vector containing a single copy of
the oligonucleotide inserted (pZipNeo28; n=1); an identical
vector with a tandem array of three oligonucleotides
self-ligated and then inserted (pZipNeo84; n=3); and a vector
with six tandem oligonucleotides inserted (pZipNeo168; n=6).
Tandem arrays of the sequence are created by ligating the
insert to itself in the presence of T4 kinase, 32P-ATP and
T4 ligase. Products corresponding to 3 and 6 ligation
events are isolated by elution from an 8% polyacrylamide gel
following autoradiography.

6.2. CASSETTE INSERTION

The recombinant pZipNeo DNAs are then transfected into a
cell line harboring a trans-acting, replication-defective
copy of the retrovirus (ψ-AM) which allows assembly of
recombinant RNA-containing infective viral particles.
pZipNeo DNA can be transfected by, for example, the scrape
loading technique (Fechheimer, PNAS USA 84:8463, 1987) into
the amphotropic packaging line. The viral particles
produced are used to infect, monotonically, the clonal human
fibroblast line MRC-5, by incubation of the packaging
cell line media with the fibroblast cells (Cone and Mulligan,



20 



6.3. DNA FRAGMENT PREPARATION
Treated cells are selected for neomycin
resistance by culturing with G418 for a period of 2-3 weeks.
Clones are evident at this time, and are subsequently picked
and grown up in the presence of G418. The cells are
harvested at confluence, cast in 1% low melting agarose
(FMC Corporation) and lysed to make "plugs" according to the
method described in Schwartz and Cantor, supra. Plugs are
prepared for MClaI/DpnI digestion, first by cutting into 1/4
pieces. They are washed twice with Tris/EDTA buffer, then
twice in Tris/EDTA buffer containing 1 mM PMSF and finally
twice more in Tris/EDTA buffer. Plugs then are cut into 1/3
slivers and washed in microfuge tubes with Tris/EDTA
containing 1 mM S-adenosylmethionine (SAM) for 1 hour. MClaI
reactions were incubated at 37°C, usually overnight, in the
presence of 40% glycerol, 1 mM SAM, 5 mM DTT in Tris/EDTA
(pH 7.5). The next morning, all reaction buffer was
replaced, additional MClaI units were added and the
reaction continued for 4 hours. DNA slices are then washed
briefly in several volumes of Tris/EDTA, then in DpnI buffer
for 1/2 hour. DpnI reactions proceeded with 25 units of
enzyme for 2-4 hours at 37°C. Reactions were halted with
addition of ESP buffer and incubation at 55°C for 20
minutes.

Subsequent digestion reactions with secondary
restriction enzymes are also performed after a washing
protocol identical to that used prior to MClaI/DpnI
digestions (without addition of S-adenosylmethionine to the
wash buffers). Appropriate reactions would then be
initiated and allowed to proceed overnight at the optimum
temperature to the extent necessary to effect partial
degradation of genomic DNA.
Subsequently, the reactions are terminated by addition of
fresh lysis solution (e.g., 1 mg/ml proteinase K/10 mM Tris
(pH 9.0)/0.5 M disodium EDTA/1% sarcosine), and the samples
are then pulse-electrophoresed on a 1% agarose gel, blotted and
probed with a XhoI-XhoI (Neo) probe created from pZipNeo DNA by
random priming (BRL) of the fragment.

Results of a specific application of these
procedures in which the secondary restriction enzyme was
allowed to proceed to completion, the reactions terminated,
and the samples loaded on a constant field electrophoresis
device, electrophoresed and probed are shown in Figure 4.

6.4. PULSED FIELD ELECTROPHORESIS SEPARATION

In general, however, following these treatments
the generated genomic DNA fragments will be separated by
pulsed field electrophoresis as described in Cantor et al.,
supra. Lambda phage concatemers are employed as
electrophoresis size markers. Wild type whole lambda phage
(cI857 Sam7) are dialyzed after purification on a cesium
chloride gradient and subsequently diluted in PBS and mixed
with an equal volume of 1.5% low gelling agarose (FMC lot
#12276). The molten agarose solution is mixed and then
pipetted into plastic forms and allowed to solidify, making
agarose 'plugs' (Schwartz, D.C. et al., CSHSQB 47:189).
The plugs are suspended in a solution composed
of 0.5 M EDTA/1% sarcosine/1 mg/ml proteinase K/10 mM
Tris (pH 9.0) and incubated for at least [illegible] hours at 55°C
with gentle shaking. Samples are loaded as described
(Schwartz, D.C. et al., supra) onto a 1% agarose gel for
electrophoresis. The gel shown was run for 62 hours at 8.5
V/cm with a pulse time of 150 seconds on an apparatus made
in this laboratory after a modified design of Schwartz and
Cantor (Waterbury and Lane, Nucl. Acids Res. 15:1940,
1987). Final DNA concentrations of 0.06-0.60 µg/[illegible] were
utilized to illustrate the optimum concentration of phage
DNA for good molecular weight 'ladders'. This technique is
illustrated in Figure 5, using whole yeast chromosomes
against a lambda phage ladder. Whole yeast chromosomes are
prepared in plugs as described, with the exception of
spheroplasting with Zymolyase at a concentration of 2 mg/ml
prior to suspension in molten agarose solution. This figure
shows a resolution of up to 1250 kb.
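
To make the use of the concatemer ladder concrete, the Python sketch below lists the rungs expected up to a given size and interpolates an unknown band between its neighbouring rungs. It assumes a lambda monomer of about 48.5 kb (a well-known figure, not stated in this specification) and simple linear interpolation between adjacent markers; the function names and migration distances are hypothetical.

```python
LAMBDA_KB = 48.5  # approximate lambda monomer length in kb (assumption)


def lambda_ladder(max_kb):
    """Sizes (kb) of lambda concatemers n = 1, 2, ... up to max_kb."""
    n_max = int(max_kb // LAMBDA_KB)
    return [round(n * LAMBDA_KB, 1) for n in range(1, n_max + 1)]


def estimate_size(migration_mm, markers):
    """Linearly interpolate a fragment size between bracketing markers.

    markers is a list of (migration_mm, size_kb) pairs read off the gel.
    """
    points = sorted(markers)
    for (m0, s0), (m1, s1) in zip(points, points[1:]):
        if m0 <= migration_mm <= m1:
            frac = (migration_mm - m0) / (m1 - m0)
            return s0 + frac * (s1 - s0)
    raise ValueError("migration distance lies outside the marker range")


ladder = lambda_ladder(1250)
# Hypothetical migrations: the 97 kb dimer at 10 mm, the monomer at 20 mm.
size = estimate_size(15.0, [(10.0, 97.0), (20.0, 48.5)])
print(len(ladder), ladder[-1], size)
```

Under these assumptions a ladder resolved to 1250 kb would contain 25 rungs, the largest at about 1212.5 kb. Linear interpolation is only a rough approximation, since mobility in pulsed field gels is not strictly linear in size.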

The separated, labelled blots of the fragments are
exposed to film, the film is used to assign molecular weights
to the fragments and the order in which they appear, and the
patterns are then examined to recognize restriction pattern
overlaps, from which a complete genome map can be determined.
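
The arithmetic behind this step can be illustrated with a short Python sketch. It assumes that each band detected with a flanking probe after partial digestion runs from the cassette to one secondary restriction site, so that band sizes are distances from the cassette. The function name `build_map`, the neglect of the cassette's own length, and the sizes used are illustrative assumptions only.

```python
def build_map(left_sizes_kb, right_sizes_kb):
    """Place secondary sites on a coordinate axis centred on the cassette.

    Fragments seen with the left-flank probe lie at negative coordinates,
    those seen with the right-flank probe at positive coordinates.
    Returns (site coordinates, distances between adjacent sites).
    """
    coords = sorted([-s for s in left_sizes_kb] + list(right_sizes_kb))
    intervals = [b - a for a, b in zip(coords, coords[1:])]
    return coords, intervals


# Hypothetical partial-digest band sizes in kb.
coords, intervals = build_map([100, 250], [120, 300])
print(coords)     # site positions relative to the cassette
print(intervals)  # distances between neighbouring secondary sites
```

Maps built this way from different insertion sites can then be joined wherever their interval patterns overlap.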

The above invention will find many applications
and uses. For example, this invention provides for a fast
and easy way to generate maps of complex genomes, including
human genomes.

Maps of genomes can be compared to each other in
order to detect any differences between them. For example,
a genome can be mapped and compared to a standard map of
that genome. This procedure will be important in such areas
as prenatal diagnosis of inherited genetic disease,
identification of induced (acquired) genetic disease, as
well as in prognostic applications.

In one embodiment, a genomic map is generated and
placed in a data base. Thereafter, any other genomic maps
generated are compared to the first map by use of an
appropriate computer algorithm.
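
The specification does not say which algorithm is meant. As one possible reading, the Python sketch below compares two maps stored as ordered lists of distances between restriction sites and reports the intervals that differ by more than a tolerance. The function name, the 5 kb tolerance, and the data are hypothetical, and a practical comparison would also need to align maps in which sites have been gained or lost.

```python
def compare_maps(reference, sample, tolerance_kb=5.0):
    """Report intervals that differ between two maps.

    Both maps are lists of inter-site distances in kb, in map order.
    Returns a list of (index, reference_kb, sample_kb) tuples, plus a
    ('length', n_ref, n_sample) entry if the maps differ in length.
    """
    differences = []
    for i, (ref, obs) in enumerate(zip(reference, sample)):
        if abs(ref - obs) > tolerance_kb:
            differences.append((i, ref, obs))
    if len(reference) != len(sample):
        differences.append(("length", len(reference), len(sample)))
    return differences


# Hypothetical maps: the second interval is 40 kb longer in the sample.
standard_map = [150, 220, 180]
test_map = [150, 260, 180]
diffs = compare_maps(standard_map, test_map)
print(diffs)
```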

The map can also be used to locate specific
genes, and to identify normal genes.

In another example, genomes from various cells in
the same organism can be mapped and compared to detect
differences between them. This will allow for greater
specificity, since most, if not all, of the genomes should
be identical to each other, and detailed maps can be
generated without having to account for variability. For
example, a standard map can be prepared from a normal human
cell, and that can be compared to a map of a neoplastic cell
from the same individual. This procedure will indicate what
genetic changes a human cell undergoes as and when it
becomes cancerous.

The technique can be used to create a library of
marked cell lines, or marked whole organisms (plant or
animal), each of which represents a particular part of the
genome.

This invention will also find ready application
in other fields, such as anthropology and evolutionary
biology. Maps of genomes from various organisms can be
generated and compared in order to further the study of
evolution. Other fields will benefit as well, such as
horticulture, animal husbandry and genetic engineering.
The present method also has a particularly
significant advantage in its ability to effect map closure
by non-random extension, at lower resolution, of maps
produced from contig ends. The inability to close a map is
a limitation of other mapping techniques. In
fact, the present method can be used to close maps prepared
by these other methods.

7. DEPOSIT OF MICROORGANISMS

The pZipNeo vector containing a ClaI/ClaI
overlapping restriction sequence is deposited, in an E a n
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It is understood that the above uses for the 
instant invention are set forth as examples only. They are 
not meant to be limiting on the instant invention. 
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WHAT IS CLAIMED IS: 

1. A method of mapping genomic DNA which
comprises

(a) integrating a DNA cassette with the genomic
DNA, said DNA cassette comprising a rare restriction
sequence flanked at one end by a uniquely identifiable DNA A
sequence, and optionally at the other end by a uniquely
identifiable DNA B sequence, wherein each sequence is
different from the others, and wherein they are
distinguishable from the genomic DNA;

(b) cutting the genomic DNA at the rare
restriction sequence, and at least one secondary restriction
sequence to produce fragments;

(c) identifying fragments containing a uniquely
identifiable sequence;

(d) measuring distance between the uniquely
identifiable sequence and the secondary restriction site;

(e) calculating therefrom distances between
different secondary restriction sites; and

(f) generating a map therefrom, the map
comprising the distances between secondary restriction
sites.

2. The method of claim 1 wherein the fragments
produced in step (b) are separated by pulsed field
electrophoresis.

3. The method of claim 1 wherein the genome to
be mapped is a vertebrate genome.

4. The method of claim 1 wherein the genome to
be mapped is a mammalian genome.
5. The method of claim 1 wherein integration is
achieved by insertion of the cassette into the genomic DNA
of a cell.

6. The method of claim 3 wherein the genome to
be mapped is a mammalian genome.

7. The method of claim 6 wherein insertion is
achieved by vector transmission of the cassette.

8. The method of claim 7 wherein the vector is a
virus.

9. The method of claim 8 wherein the vector is a
retrovirus.

11. The method of claim 10 wherein the vector is
a pZipNeo.



12. The method of any one of claims 3-11 wherein
the rare sequence is a ClaI/ClaI restriction site.

13. The method of claim 12 wherein the ClaI site
is treated with MClaI prior to cutting.

14. The method of claim 13 wherein the rare
sequence is cut with DpnI.

15. The method of claim 1 wherein the rare 
sequence is present in a tandem array. 
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16. The method of claim 12 wherein the rare 
sequence is present in a tandem array. 

17. The method of any one of claims 3-11 wherein
the rare sequence is a NotI site in tandem array.

18. The method of claim 1 wherein the rare
sequence is a D loop.

19. The method of claim 1 wherein the rare
sequence is a triple helix.

20. The method of any one of claims 3-11 or 15
wherein the genomic DNA is human genomic DNA.

21. The method of claim 12 wherein the genomic
DNA is human genomic DNA.

22. The method of claim 13 wherein the genomic
DNA is human genomic DNA.

23. The method of claim 14 wherein the genomic
DNA is human genomic DNA.

24. The method of claim 17 wherein the genomic
DNA is human genomic DNA.

25. The method of claim 18 wherein the genomic
DNA is human genomic DNA.

26. The method of claim 19 wherein the genomic
DNA is human genomic DNA.

27. The method of claim 1 wherein the genome to
be mapped is an insect genome.
28. The method of claim 27 wherein the DNA is
integrated by insertion of the cassette into a cell
containing the genomic DNA.

29. The method of claim 28 wherein insertion is
achieved by vector transmission of the cassette.

30. The method of claim 29 wherein the vector is
a P element.

31. The method of claim 30 wherein the vector is
pUChsneo.

32. The method of claim 30 or 31 wherein the rare
sequence is a ClaI/ClaI restriction site.

33. The method of claim 32 wherein the ClaI site
is treated with MClaI prior to cutting.

34. The method of claim 32 wherein the rare
sequence is cut with DpnI.

35. The method of claim 32 wherein the rare
sequence is present in a tandem array.

36. The method of claim 1 wherein the genome to
be mapped is a bacterial genome.

37. The method of claim 36 wherein integration is
achieved by insertion of the cassette into a cell containing
the genomic DNA.

38. The method of claim 36 wherein the insertion
is achieved by vector transmission of the cassette DNA.
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39. The method of claim 38 wherein the vector is 
a transposon. 

40. The method of claim 1 wherein the genomic DNA
is plant genomic DNA.

41. The method of claim 40 wherein integration is
achieved by insertion of the cassette into a cell containing
the genomic DNA.

42. The method of claim 41 wherein the insertion
is achieved by vector transmission of the cassette DNA.

43. The method of claim 42 wherein the vector is
a Ti plasmid.

44. The method of claim 1 wherein the genomic DNA
is yeast DNA.

45. The method of claim 44 wherein integration is
achieved by insertion of the cassette into a cell containing
the genomic DNA.

46. The method of claim 45 wherein the insertion
is achieved by vector transmission of the cassette DNA.

47. The method of claim 46 wherein the vector is
a Ty element.

48. A DNA cassette comprising a restriction
sequence rare in the vertebrate genome, flanked on at least
one side by bacterial or viral DNA sequence.



49. The cassette of claim 48 wherein the 
vertebrate is a mammal. 
50. The cassette of claim 49 wherein the flanking
sequence is a viral sequence.

51. The cassette of claim 50 wherein the flanking
sequence is a retroviral sequence.

52. The cassette of claim 51 wherein the flanking
sequence is substantially homologous to a sequence found in
the vector pZipNeo.

53. The cassette of any one of claims 48-52
wherein the rare restriction sequence is ClaI/ClaI.

54. The cassette of claim 48 wherein the rare
restriction sequence is present in a tandem array.

55. The cassette of claim 53 wherein the rare
restriction sequence is present in a tandem array.

56. The cassette of claim 53 wherein the
ClaI/ClaI sequence has been methylated by MClaI.

57. The cassette of claim 55 wherein the
ClaI/ClaI sequence has been methylated by MClaI.

58. A DNA cassette comprising a restriction
sequence rare in an insect genome, being flanked on at least
one side by a P element DNA sequence.

59. The cassette of claim 58 wherein the insect
is Drosophila.
60. The cassette of claim 58 or 59 wherein the
flanking sequence is substantially homologous to a sequence
found in the vector pUChsneo.

61. The cassette of claim 58 or 59 wherein the
rare restriction sequence is a ClaI/ClaI restriction site.

62. The cassette of claim 60 wherein the rare
restriction sequence is present in a tandem array.

63. The cassette of claim 61 wherein the rare
restriction sequence is present in a tandem array.

64. The cassette of claim 62 wherein the
ClaI/ClaI sequence has been methylated by MClaI.

65. The cassette of claim 63 wherein the
ClaI/ClaI sequence has been methylated by MClaI.

66. A DNA cassette comprising a restriction
sequence rare in the plant genome flanked on at least one
side by a Ti plasmid DNA sequence.

67. The cassette of claim 66 wherein the rare
restriction sequence is present in a tandem array.

68. A DNA cassette comprising a restriction
sequence rare in the yeast genome flanked on at least one
side by a Ty element DNA sequence.

69. The cassette of claim 68 wherein the rare
restriction sequence is present in a tandem array.

70. A DNA cassette comprising a restriction
sequence rare in the bacterial genome flanked on at least
one side by a transposon DNA sequence.
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71. The cassette of claim 70 wherein the rare 
restriction sequence is present in a tandem array. 
72. A vector comprising the cassette of any one
of claims 48-52.

73. A vector comprising the cassette of claim 53.

74. A vector comprising the cassette of claim 55.

75. A vector comprising the cassette of claim 56.

76. A vector comprising the cassette of claim 58
or 59.

77. A vector comprising the cassette of claim 60.

78. A vector comprising the cassette of claim 61.

79. A vector comprising the cassette of claim 62.

80. A vector comprising the cassette of claim 63.

81. A vector comprising the cassette of claim 65.

82. A vector comprising the cassette of claims 66
or 67.

83. A vector comprising the cassette of claims 68
or 69.

84. A vector comprising the cassette of claims 70
or 71.

85. A vertebrate cell having integrated into its 
genome the cassette of any one of claims 48-52. 
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86. The cell of claim 85 wherein the vertebrate 
is a mammal. 

87. The cell of claim 86 wherein the mammal is a
human.

88. A vertebrate cell having integrated into its
genome the cassette of any one of claims 48-52.

89. A vertebrate cell having integrated into its
genome the cassette of claim 53.

90. A vertebrate cell having integrated into its
genome the cassette of claim 54.

91. A vertebrate cell having integrated into its
genome the cassette of claim 55.

92. A vertebrate cell having integrated into its
genome the cassette of claim 56.

93. A vertebrate cell having integrated into its
genome the cassette of claim 57.

94. An insect cell having incorporated into its
genome the cassette of claim 58 or 59.

95. An insect cell having incorporated into its
genome the cassette of claim 60.

96. An insect cell having incorporated into its
genome the cassette of claim 61.

97. An insect cell having incorporated into its
genome the cassette of claim 62.
98. An insect cell having incorporated into its
genome the cassette of claim 63.

99. An insect cell having incorporated into its
genome the cassette of claim 64.

100. An insect cell having incorporated into its
genome the cassette of claim 65.

101. A plant cell having integrated into its
genome the cassette of claim 66 or 67.

102. A yeast cell having integrated into its
genome the cassette of claim 68 or 69.

103. A bacterial cell having integrated into its
genome the cassette of claim 70 or 71.

104. A continuous genomic map prepared according
to the method of claims 1-11 or 15.

105. A continuous genomic map prepared according
to the method of claim 12.

106. A continuous genomic map prepared according
to the method of claim 14.

107. A continuous genomic map prepared according
to the method of claim 20.

108. A continuous genomic map prepared according
to the method of claim 27.

109. A continuous genomic map prepared according
to the method of claim 29.
110. A continuous genomic map prepared according
to the method of claim 37.

111. A continuous genomic map prepared according
to the method of claim 41.

112. A continuous genomic map prepared according
to the method of claim 44.
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